An Efficient Large-scale Semi-supervised Multi-label Classifier Capable of Handling Missing labels
Authors
Abstract
Multi-label classification has received considerable interest in recent years. Multi-label classifiers have to address many problems, including handling large-scale datasets with many instances and a large set of labels, compensating for missing label assignments in the training set, considering correlations between labels, and exploiting unlabeled data to improve prediction performance. To tackle datasets with a large set of labels, embedding-based methods have been proposed, which seek to represent the label assignments in a low-dimensional space. Many state-of-the-art embedding-based methods use a linear dimensionality reduction for this purpose. However, by doing so, these methods neglect the tail labels, that is, labels that are infrequently assigned to instances. We propose an embedding-based method that non-linearly embeds the label vectors using a stochastic approach, thereby predicting the tail labels more accurately. Moreover, the proposed method has mechanisms for handling missing labels, dealing with large-scale datasets, and exploiting unlabeled data. To the best of our knowledge, our proposed method is the first multi-label classifier that simultaneously addresses all of the mentioned challenges. Experiments on real-world datasets show that our method outperforms state-of-the-art multi-label classifiers by a large margin in terms of both prediction performance and training time.
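The claim that a linear low-dimensional representation of the label assignments can neglect tail labels can be illustrated with a toy example. The sketch below is not the method proposed in the paper, nor any specific baseline it compares against; it assumes a plain truncated SVD of a small hand-built label matrix, and the function name `low_rank_label_reconstruction` and all sizes are hypothetical choices made for illustration.

```python
import numpy as np


def low_rank_label_reconstruction(Y, k):
    """Project the label matrix Y onto its top-k singular directions.

    This stands in for a generic linear label embedding (assumption:
    truncated SVD), not for the method described in the abstract.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


# Toy label matrix: 100 instances, 3 labels (all sizes are illustrative).
n_instances, n_labels = 100, 3
Y = np.zeros((n_instances, n_labels))
Y[0:50, 0] = 1.0    # head label, assigned to 50 instances
Y[30:80, 1] = 1.0   # head label, overlaps label 0 on instances 30..49
Y[90:92, 2] = 1.0   # tail label, assigned to only 2 instances

# Embed the 3 labels into 2 dimensions and reconstruct.
Y_hat = low_rank_label_reconstruction(Y, k=2)

# Squared reconstruction error per label (column).
per_label_error = np.linalg.norm(Y - Y_hat, axis=0) ** 2
print(per_label_error)
```

In this construction the two head labels carry almost all of the matrix's energy, so the rank-2 projection keeps them and discards the direction belonging to the tail label: the squared error is concentrated entirely on the third column. Real datasets are far less clean, but the same pressure applies whenever the embedding dimension is much smaller than the number of labels.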
Similar resources
Towards Multi Label Text Classification through Label Propagation
Classifying text data has been an active area of research for a long time. A text document is a multifaceted object and is often inherently ambiguous. Multi-label learning deals with such ambiguous objects. Classifying such ambiguous text objects often makes the classifier's task of assigning relevant classes to an input document difficult. Traditional single label and multi class text cla...
Semi-supervised Multi-label Learning Algorithm Using Dependency Among Labels
In this paper, we present a semi-supervised algorithm for multi-label learning that explores the relationships among labels. Based on accuracy, we determine the classification order for the labels; a list of classifiers is then trained in this order, with each classifier using the outputs of the previous classifiers in the list as additional input features. Experiments on three multi-...
Multi Label Text Classification through Label Propagation
Classifying text data has been an active area of research for a long time. A text document is a multifaceted object and is often inherently ambiguous. Multi-label learning deals with such ambiguous objects. Classifying such ambiguous text objects often makes the classifier's task of assigning relevant classes to an input document difficult. Traditional single label and multi class text cla...
Multi Label Spatial Semi Supervised Classification using Spatial Associative Rule Mining and Evolutionary Algorithms
This paper proposes multi-label spatial classification based on association rules with multi-objective genetic algorithms (MOGA), enriched by semi-supervised learning, to deal with the problem of multiple class labels. We adapt problem transformation for multi-label classification. We use a hybrid evolutionary algorithm for the optimization in the generation of spatial assoc...
Exploiting Associations between Class Labels in Multi-label Classification
Multi-label classification has many applications in text categorization, biology, and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. Because the labels are often related, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...
Journal title:
- CoRR
Volume abs/1606.05725 Issue
Pages -
Publication date 2016